AITopics | race 0

A Comprehensive Study of Implicit and Explicit Biases in Large Language Models

Kazi, Fatima, Young, Alex, Inani, Yash, Rafatirad, Setareh

arXiv.org Artificial Intelligence, Nov-19-2025

Large Language Models (LLMs) inherit explicit and implicit biases from their training datasets. Identifying and mitigating biases in LLMs is crucial to ensure fair outputs, as they can perpetuate harmful stereotypes and misinformation. This study highlights the need to address biases in LLMs amid the growing use of generative AI. We studied bias-specific benchmarks such as StereoSet and CrowS-Pairs to evaluate the existence of various biases in multiple generative models such as BERT and GPT 3.5. We proposed an automated Bias-Identification Framework to recognize various social biases in LLMs, such as gender, race, profession, and religion. We adopted a two-pronged approach to detect explicit and implicit biases in text data. Results indicated that fine-tuned models struggled with gender biases but excelled at identifying and avoiding racial biases. Our findings illustrated that, despite some success, LLMs often over-relied on keywords. To illuminate the capability of the analyzed LLMs in detecting implicit biases, we employed Bag-of-Words analysis and found indications of implicit stereotyping within the vocabulary. To bolster model performance, we applied an enhancement strategy involving fine-tuning models using prompting techniques and data augmentation of the bias benchmarks. The fine-tuned models exhibited promising adaptability during cross-dataset testing and significantly enhanced performance on implicit bias benchmarks, with performance gains of up to 20%.
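The Bag-of-Words analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' Bias-Identification Framework: the function names and the toy sentences are hypothetical, and the sketch only shows the general idea of comparing word counts between two groups of sentences to surface vocabulary that separates them.

```python
from collections import Counter
import re


def bag_of_words(sentences):
    """Count lowercase word tokens across a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(re.findall(r"[a-z']+", sentence.lower()))
    return counts


def distinctive_words(group_a, group_b):
    """Return {word: count_a - count_b} for words whose counts differ."""
    a, b = bag_of_words(group_a), bag_of_words(group_b)
    return {w: a[w] - b[w] for w in set(a) | set(b) if a[w] != b[w]}


# Hypothetical toy sentences, not drawn from StereoSet or CrowS-Pairs.
stereotyped = ["the nurse said she was tired", "the engineer said he was busy"]
neutral = ["the nurse said they were tired", "the engineer said they were busy"]

diff = distinctive_words(stereotyped, neutral)
for word, delta in sorted(diff.items(), key=lambda item: -abs(item[1])):
    print(f"{word:>6}: {delta:+d}")
```

Words with large positive or negative differences are candidates for closer inspection; a real analysis would also need to normalize by corpus size and filter out function words.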

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.14153

Country: North America > United States > California > Yolo County > Davis (0.15)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)

What does making money have to do with crime?: A dive into the National Crime Victimization survey

Anuyah, Sydney

arXiv.org Artificial Intelligence, Jun-6-2025

In this short article, I leverage the National Crime Victimization Survey from 1992 to 2022 to examine how income, education, employment, and key demographic factors shape the type of crime victims experience (violent vs. property). Using balanced classification splits and logistic regression models evaluated by F1-score, I first isolate the socioeconomic drivers of victimization ("Group A" models) and then introduce demographic controls such as age, gender, race, and marital status ("Group B" models). The results consistently show that higher income and education lower the odds of violent relative to property crime, while men, younger individuals, and racial minorities face disproportionately higher violent-crime risks. Geographically, the suburban models achieve the strongest predictive performance, with an accuracy of 0.607 and an F1 of 0.590; urban models benefit from adding education and employment predictors; and crime in rural areas remains hard to predict with the current factors. The patterns found in this study show the need for targeted interventions such as educational investments in metropolitan settings, economic support in rural communities, and demographic-aware prevention strategies.
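For readers unfamiliar with the two metrics quoted in the abstract, the following minimal sketch computes accuracy and F1 from binary labels. The function name, the label encoding (1 = violent, 0 = property), and the toy labels are assumptions made for illustration; they are not taken from the article or from the survey data.

```python
def classification_scores(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels: 1 = violent crime, 0 = property crime.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
scores = classification_scores(y_true, y_pred)
print(scores)
```

On a balanced split such as the one described, an accuracy near 0.5 corresponds to chance, which puts the reported 0.607 in context.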

artificial intelligence, crime, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2506.0424

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

Fairness Testing through Extreme Value Theory

Monjezi, Verya, Trivedi, Ashutosh, Kreinovich, Vladik, Tizpaz-Niari, Saeid

arXiv.org Artificial Intelligence, Jan-20-2025

Data-driven software is increasingly being used as a critical component of automated decision-support systems. Since this class of software learns its logic from historical data, it can encode or amplify discriminatory practices. Previous research on algorithmic fairness has focused on improving average-case fairness. On the other hand, fairness at the extreme ends of the spectrum, which often signifies lasting and impactful shifts in societal attitudes, has received significantly less emphasis. Leveraging the statistics of extreme value theory (EVT), we propose a novel fairness criterion called extreme counterfactual discrimination (ECD). This criterion estimates the worst-case amounts of disadvantage in outcomes for individuals solely based on their memberships in a protected group. Utilizing tools from search-based software engineering and generative AI, we present a randomized algorithm that samples a statistically significant set of points from the tail of ML outcome distributions even if the input dataset lacks a sufficient number of relevant samples. We conducted several experiments on four ML models (deep neural networks, logistic regression, and random forests) over 10 socially relevant tasks from the literature on algorithmic fairness. First, we evaluate the generative AI methods and find that they generate sufficient samples to infer valid EVT distribution in 95% of cases. Remarkably, we found that the prevalent bias mitigators reduce the average-case discrimination but increase the worst-case discrimination significantly in 5% of cases. We also observed that even the tail-aware mitigation algorithm -- MiniMax-Fairness -- increased the worst-case discrimination in 30% of cases. We propose a novel ECD-based mitigator that improves fairness in the tail in 90% of cases with no degradation of the average-case discrimination.
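To give a flavor of tail estimation in extreme value theory, here is a minimal sketch of the Hill estimator, a standard EVT estimator of the tail index computed from the largest observations. The abstract does not say which estimator the authors use, so this is a swapped-in illustration rather than the ECD criterion itself; the function name and the sample values are hypothetical.

```python
import math


def hill_estimator(values, k):
    """Hill estimate of the tail index from the k largest observations.

    With order statistics X_(1) >= ... >= X_(n), the estimate is
    (1/k) * sum_{i=1..k} ln(X_(i) / X_(k+1)).
    """
    if not 0 < k < len(values):
        raise ValueError("k must satisfy 0 < k < len(values)")
    ordered = sorted(values, reverse=True)
    threshold = ordered[k]  # X_(k+1), the (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("the threshold order statistic must be positive")
    return sum(math.log(x / threshold) for x in ordered[:k]) / k


# Hypothetical positive "disadvantage" scores, e.g. gaps between a model's
# outputs for an individual and a counterfactual with a different group.
gaps = [1.0, 2.0, 4.0, 8.0, 16.0]
print(hill_estimator(gaps, k=2))
```

A larger estimate indicates a heavier tail, that is, a greater chance of extreme values well beyond the average case. The choice of k trades bias against variance and is usually inspected over a range of values.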

discrimination, iii-finite linear, sex 0, (15 more...)

arXiv.org Artificial Intelligence

2501.11597

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Texas (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.54)

Fair4Free: Generating High-fidelity Fair Synthetic Samples using Data Free Distillation

Sikder, Md Fahim, de Leng, Daniel, Heintz, Fredrik

arXiv.org Artificial Intelligence, Oct-2-2024

This work presents Fair4Free, a novel generative model that generates synthetic fair data using data-free distillation in the latent space. Fair4Free works in situations where the data is private or inaccessible. In our approach, we first train a teacher model to create a fair representation and then distil the knowledge to a student model (using a smaller architecture). The process of distilling the student model is data-free, i.e. the student model does not have access to the training dataset while distilling. After the distillation, we use the distilled model to generate fair synthetic samples. Our extensive experiments show that our synthetic samples outperform state-of-the-art models in all three criteria (fairness, utility and synthetic quality) with a performance increase of 5% for fairness, 8% for utility and 12% in synthetic quality for both tabular and image datasets. Nowadays, people rely on Artificial Intelligence-based applications to seek answers or make decisions. These AI-based models are trained with the data available in the real world.

dataset, fair representation, synthetic sample, (16 more...)

arXiv.org Artificial Intelligence

2410.01423

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Sweden > Östergötland County > Linköping (0.04)
Europe > Spain > Galicia > A Coruña Province > Santiago de Compostela (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Generating Synthetic Fair Syntax-agnostic Data by Learning and Distilling Fair Representation

Sikder, Md Fahim, Ramachandranpillai, Resmi, de Leng, Daniel, Heintz, Fredrik

arXiv.org Artificial Intelligence, Aug-20-2024

Data fairness is a crucial topic due to the recent wide usage of AI-powered applications. Most real-world data contains human or machine biases, and when such data is used to train AI models, there is a chance that the model will reflect the bias in the training data. Existing bias-mitigating generative methods based on GANs and diffusion models need in-processing fairness objectives and fail to consider computational overhead while choosing computationally-heavy architectures, which may lead to high computational demands, instability and poor optimization performance. To mitigate this issue, in this work, we present a fair data generation technique based on knowledge distillation, where we use a small architecture to distill the fair representation in the latent space. The idea of fair latent space distillation enables more flexible and stable training of Fair Generative Models (FGMs). We first learn a syntax-agnostic (for any data type) fair representation of the data, followed by distillation in the latent space into a smaller model. After distillation, we use the distilled fair latent space to generate high-fidelity fair synthetic data. While distilling, we employ quality loss (for fair distillation) and utility loss (for data utility) to ensure that the fairness and data utility characteristics remain in the distilled latent space. Our approaches show a 5%, 5% and 10% rise in performance in fairness, synthetic sample quality and data utility, respectively, over the state-of-the-art fair generative model.

distillation, latent space, representation, (17 more...)

arXiv.org Artificial Intelligence

2408.10755

Country:

North America > United States (0.04)
Europe > Sweden > Östergötland County > Linköping (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

FairX: A comprehensive benchmarking tool for model analysis using fairness, utility, and explainability

Sikder, Md Fahim, Ramachandranpillai, Resmi, de Leng, Daniel, Heintz, Fredrik

arXiv.org Artificial Intelligence, Jun-20-2024

We present FairX, an open-source Python-based benchmarking tool designed for the comprehensive analysis of models under the umbrella of fairness, utility, and eXplainability (XAI). FairX enables users to train benchmarking bias-removal models, evaluate them using a wide array of fairness metrics and data utility metrics, and generate explanations for model predictions, all within a unified framework. Existing benchmarking tools cannot evaluate synthetic data generated from fair generative models, nor do they support training fair generative models. In FairX, we add fair generative models to our fair-model library (pre-processing, in-processing, post-processing), along with evaluation metrics for the quality of synthetic fair data. This version of FairX supports both tabular and image datasets. It also allows users to provide their own custom datasets. The open-source FairX benchmarking package is publicly available at https://github.com/fahim-sikder/FairX.

dataset, fairx, synthetic data, (15 more...)

arXiv.org Artificial Intelligence

2406.14281

Country:

North America > United States (0.04)
Europe > Sweden > Östergötland County > Linköping (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

Does Machine Bring in Extra Bias in Learning? Approximating Fairness in Models Promptly

Bian, Yijun, Luo, Yujie

arXiv.org Artificial Intelligence, May-15-2024

With machine learning (ML) applications increasingly deployed in the real world, concerns about discrimination hidden in ML models are growing, particularly in high-stakes domains. Existing techniques for assessing the discrimination level of ML models include commonly used group and individual fairness measures. However, these two types of fairness measures are usually hard to reconcile, and even two different group fairness measures might be incompatible with each other. To address this issue, we evaluate the discrimination level of classifiers from a manifold perspective and propose a "harmonic fairness measure via manifolds (HFM)" based on distances between sets. Yet the direct calculation of distances might be too expensive to afford, reducing its practical applicability. Therefore, we devise an approximation algorithm named "Approximation of distance between sets (ApproxDist)" to facilitate accurate estimation of distances, and we further demonstrate its algorithmic effectiveness under certain reasonable assumptions. Empirical results indicate that the proposed fairness measure HFM is valid and that the proposed ApproxDist is effective and efficient.
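The cost concern raised in the abstract can be seen in a brute-force computation of a distance between two point sets. The abstract does not define the distance underlying HFM, so the sketch below uses the Hausdorff distance purely as a familiar stand-in; it is neither HFM nor ApproxDist, and the function names and points are hypothetical.

```python
import math


def directed_hausdorff(set_a, set_b):
    """Largest distance from a point in set_a to its nearest point in set_b."""
    return max(min(math.dist(a, b) for b in set_b) for a in set_a)


def hausdorff(set_a, set_b):
    """Symmetric Hausdorff distance; compares every pair, so O(|A| * |B|)."""
    return max(directed_hausdorff(set_a, set_b),
               directed_hausdorff(set_b, set_a))


# Hypothetical 2-D feature vectors for two groups of individuals.
group_a = [(0.0, 0.0), (1.0, 0.0)]
group_b = [(0.0, 0.0), (4.0, 0.0)]
print(hausdorff(group_a, group_b))
```

Because every point in one set is compared with every point in the other, the running time grows with the product of the set sizes, which is the kind of cost that motivates an approximation algorithm on large datasets.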

approxdist, fairness, fairness measure, (16 more...)

arXiv.org Artificial Intelligence

2405.09251

Country:

Asia > Singapore (0.04)
North America > United States > Florida > Broward County (0.04)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Bias Neutralization Framework: Measuring Fairness in Large Language Models with Bias Intelligence Quotient (BiQ)

Narayan, Malur, Pasmore, John, Sampaio, Elton, Raghavan, Vijay, Waters, Gabriella

arXiv.org Artificial Intelligence, Apr-28-2024

The burgeoning influence of Large Language Models (LLMs) in shaping public discourse and decision-making underscores the imperative to address inherent biases within these AI systems. In the wake of AI's expansive integration across sectors, addressing racial bias in LLMs has never been more critical. This paper introduces a novel framework called Comprehensive Bias Neutralization Framework (CBNF) which embodies an innovative approach to quantifying and mitigating biases within LLMs. Our framework builds on the Large Language Model Bias Index (LLMBI) [Oketunji, A., Anas, M., Saina, D., (2023)] and Bias removaL with No Demographics (BLIND) [Orgad, H., Belinkov, Y. (2023)] methodologies to create a new metric called Bias Intelligence Quotient (BiQ) which detects, measures, and mitigates racial bias in LLMs without reliance on demographic annotations. By introducing a new metric called BiQ that enhances LLMBI with additional fairness metrics, CBNF offers a multi-dimensional metric for bias assessment, underscoring the necessity of a nuanced approach to fairness in AI [Mehrabi et al., 2021]. This paper presents a detailed analysis of Latimer AI (a language model incrementally trained on black history and culture) in comparison to ChatGPT 3.5, illustrating Latimer AI's efficacy in detecting racial, cultural, and gender biases through targeted training and refined bias mitigation strategies [Latimer & Bender, 2023]. Through empirical studies, our approach not only demonstrates the feasibility of detecting and measuring racial bias but also offers a scalable solution adaptable to various AI applications, underscoring our commitment to fostering more equitable and reliable AI technologies. Our method focuses on providing a comprehensive framework for the detection, quantification, and mitigation of racial biases in monolingual LLMs, with a special emphasis on Retrieval Augmented Generation (RAG) based models [Lewis, Perez et.al.

llm, race 0, race 1, (14 more...)

arXiv.org Artificial Intelligence

2404.18276

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.84)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Law Enforcement & Public Safety (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)
